ABSTRACT



A method and system for facilitating a keyword search 
request initiated at a client station within a multilevel data 
network, wherein the multilevel data network includes mul- 
tiple local sites each containing multiple data pages. Mul- 
tiple keywords from each of the data pages within the local 
sites of the multilevel data network are stored locally and 
indexed such that each of the keywords points to one or 
more of the data pages in which the keywords are contained. 
The keywords and their index associations are locally 
updated. A central database is utilized to compile and index 
the locally indexed keywords from each of the local sites, 
such that each of the keywords in the central database points 
to one or more local sites from which those keywords came 
in response to a keyword search initiated at the client station. 

40 Claims, 6 Drawing Sheets 







02/18/2004, EAST Version: 1.4.1 



U.S. Patent Dec. 3, 2002 Sheet 1 of 6 US 6,490,575 B1








[Sheet 2 of 6: drawing not reproduced. Surviving labels: CLIENTS; WEB SITES.]






[Sheet 4 of 6, FIG. 5: drawing not reproduced. Surviving labels: User 502; Application Search Button 504; HTTP; Local Server 508; Local Site B; Page (repeated). Other reference numerals shown: 500, 506, 510, 512, 520, 522, 533, 534.]






[Sheet 6 of 6, FIGS. 7 and 8: flowcharts not reproduced. FIG. 7 step labels (reference numerals 700, 702, 704, 706, 708): Register with GSE; Update Keyword-to-Site Master Index; Search Local Site; Update Keyword-to-Web Page Index. FIG. 8 step labels (reference numerals 800, 802, 804, 806, 808, 812): Initiate Global Keyword Search; Retrieve Master Index Results; Select Local Site/Server; Stop.]




DISTRIBUTED NETWORK SEARCH ENGINE 

BACKGROUND OF THE INVENTION 

1. Technical Field 

The present invention relates to an improved method and 
system for accessing a network database, and in particular to 
a method and system for efficiently searching a distributed, 
hierarchical network database, such as the World Wide Web 
(WWW). More particularly, the present invention relates to 
improving network search efficiency by distributing search 
engine functionality via links among various public or 
private data networks. 

2. Description of the Related Art 

Network Access to Information 

The development of computerized information resources, 
such as the Internet, allows users of data-processing systems 
to link with other servers and networks, and thus retrieve 
vast amounts of electronic information heretofore unavail- 
able in an electronic medium. The term "Internet" is an 
abbreviation for "Internetwork," and refers commonly to the 
collection of networks and gateways that utilize the TCP/IP 
suite of protocols, which are well-known in the art of 
computer networking. TCP/IP is an acronym for "Transmis- 
sion Control Protocol/Internet Protocol," and is a software 
protocol developed by the Department of Defense for com- 
munication between computers. The Internet can be 
described as a system of geographically distributed com- 
puter networks interconnected by computers executing net- 
working protocols that allow users to interact and share 
information over the networks. Because of such wide-spread 
information sharing, the Internet has thus far generally 
evolved into an "open" system for which developers can 
design software applications for performing specialized 
operations or services, essentially without restriction. 

Electronic information transferred between data- 
processing networks is usually presented in hypertext, a 
metaphor for presenting information in a manner in which 
text, images, sounds, and actions become linked together in 
a complex non-sequential Web of associations that permit 
the user to "browse" or "navigate" through related topics, 
regardless of the presented order of the topics. These links 
are often established by both the author of a hypertext 
document and by the user, depending on the intent of the 
hypertext document. For example, traveling among links to 
the word "iron" in an article displayed within a graphical 
user interface in a data-processing system might lead the 
user to the periodic table of the chemical elements (i.e., 
linked by the word "iron"), or to a reference to the use of iron 
in weapons in Europe in the Dark Ages. The term "hyper- 
text" was coined in the 1960s to describe documents, as 
presented by a computer, that express the nonlinear structure 
of ideas, in contrast to the linear format of books, film, and 
speech. 

The term "hypermedia," on the other hand, more recently 
introduced, is nearly synonymous with "hypertext" but 
focuses on the nontextual components of hypertext, such as 
animation, recorded sound, and video. Hypermedia is the 
integration of graphics, sound, video, or any combination 
thereof into a primarily associative system of information 
storage and retrieval. Hypermedia, as well as hypertext,
especially in an interactive format where choices are controlled
by the user, is structured around the idea of offering
a working and learning environment that parallels human
thinking; that is, an environment that allows the user to
make associations between topics rather than move sequentially
from one to the next, as in an alphabetic list.
Hypermedia, as well as hypertext topics, are thus linked in
a manner that allows the user to jump from one subject to
other related subjects during a search for information.
Hyperlink information is contained within hypermedia and
hypertext documents, which allow a user to move back to
"original" or referring network sites by the mere "click"
(i.e., with a mouse or other pointing device) of the hyperlinked
topic.

A typical networked system that utilizes hypertext and
hypermedia conventions follows a client/server architecture.
The "client" is a member of a class or group that uses the
services of another class or group to which it is not related.
Thus, in computing, a client is a process (i.e., roughly a
program or task) that requests a service provided by another
program. The client process utilizes the requested service
without having to "know" any working details about the
other program or the service itself. In a client/server
architecture, particularly a networked system, a client is
usually a computer that accesses shared network resources
provided by another computer system (i.e., a server or
Internet Service Provider (ISP)).

A request by a user for news or other information can be
sent by a client application program to a server. A server is
typically a remote computer system accessible over the
Internet or other telecommunications medium. The server
scans and searches for raw (e.g., unprocessed) information
sources (e.g., newswire feeds or newsgroups). Based upon
such requests by the user, the server presents filtered
electronic information as server responses to the client process.
The client process may be active in a first computer system
communicating with the server process which is active in a
second computer system, over a telecommunications
medium, thus providing distributed functionality and allowing
multiple clients to take advantage of the information-gathering
capabilities of the server.

Client and server communicate with one another utilizing
the functionality provided by Hypertext-Transfer Protocol
(HTTP). The World Wide Web (WWW) or, simply, the
"Web," includes those servers adhering to this standard (i.e.,
HTTP) which are accessible to clients via a computer or
data-processing system network address such as a Universal
Resource Locator (URL). The network address can be
referred to as a Universal Resource Locator address. The
client and server may be coupled to one another via Serial
Line Internet Protocol (SLIP) or TCP/IP connections for
high-capacity communication. Active within the client is a
first process, known as a "browser," which establishes the
connection with the server and presents information to the
user. The server itself executes corresponding server software
which presents information to the client in the form of
HTTP responses. The HTTP responses correspond to "Web
pages" constructed from a Hypertext Markup Language
(HTML), or other server-generated data. Each Web page can
also be referred to simply as a "page."
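
As a rough illustration of the browser/server exchange described above, the following Python sketch runs a throwaway server process on the loopback interface and retrieves one HTML page from it over HTTP. Nothing in it comes from the source: the names `PageHandler`, `fetch_page`, and `PAGE`, the page content, and the loopback address are assumptions made for illustration, and Python's standard `http.server`, `socketserver`, and `urllib.request` modules merely stand in for the server software and the browser.

```python
import socketserver
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler

# Hypothetical page content; any HTML document would do.
PAGE = b"<html><body><h1>Example page</h1></body></html>"


class PageHandler(BaseHTTPRequestHandler):
    """Server side: answers every GET request with the same HTML page."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the example quiet


def fetch_page():
    """Start a local server, request one page as a client, return the reply."""
    # Port 0 asks the operating system for any free port.
    server = socketserver.TCPServer(("127.0.0.1", 0), PageHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://127.0.0.1:%d/index.html" % server.server_address[1]
        # An empty ProxyHandler keeps the request on the loopback interface.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(url, timeout=5) as response:
            return response.status, response.headers["Content-Type"], response.read()
    finally:
        server.shutdown()
        server.server_close()
```

Calling `fetch_page()` returns the status code, the content type, and the page body, which is the information a browser would use to render the page for the user.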

Conventional Search Engine Infrastructure 

The evolution of personal computers over the last decade
has accelerated the Web and Internet toward useful everyday
applications. The graphical portion of the World Wide Web
itself is usually stocked with more than twenty-two million
"pages" of content, with over one million new pages added
every month. Readily accessible computer software applications
such as Internet "search engines" provide a means
for Internet users to track down sites at which information on
a topic of interest can be found. A person may type in a
subject or key word which the search engine utilizes to
locate a list of pertinent network sites (i.e., Web sites) and
Web pages. Thus, with "home pages" published by thousands
of companies, universities, government agencies,
museums, and municipalities, the Internet can be an invaluable
information retrieval resource. The market for Internet
access and related applications is expanding at an explosive
pace.

All search engine applications available today are
equipped with a search-and-find facility that is accessed
when a user types in a requested search item and "clicks" on
the application's 'Search' button. The data sought may
potentially be stored at as many as tens of thousands of Web
pages within thousands of network sites. Each of these Web
pages may include hypertext links which point to other sites
and/or pages at which related information may be found.
The process of searching or browsing the Web is therefore
an extremely time consuming and computation intensive
multiple recursive process possibly covering many thousands
of possible Web sites and pages.

Conventional search engines maintain internal indices in
which the network addresses of Web sites and pages are
associated with particular "keywords". When a user types in
one or more keywords during a Web search, the search
engine examines its internal keyword index to determine
first whether the keyword is present within the index, and if
so, the addresses of the pages at which the keyword(s) is/are
located. Given the explosive growth of the Internet as an
information repository, storing and updating such an index
is proving burdensome both in terms of information storage
capacity and computation bandwidth.

From the foregoing, it can be appreciated that a need
exists for a method and system for strategically distributing
the search engine functionality across rapidly growing electronic
data networks such as the Internet. If implemented,
such a method and system would improve both efficiency
and comprehensiveness of distributed network
searches.

SUMMARY OF THE INVENTION

It is therefore an object of the invention to provide an
improved information-retrieval method and system.

It is another object of the invention to provide an
improved method and system for efficiently searching a
distributed, hierarchical network database, such as the World
Wide Web (WWW).

It is a further object of the invention to improve network
search efficiency by distributing search engine functionality
via links among various public or private data networks.

The above and other objects are achieved as is now
described. A method and system are disclosed for facilitating
a keyword search request initiated at a client station within
a multilevel data network, wherein the multilevel data
network includes multiple local sites each containing multiple
data pages. Multiple keywords from each of the data
pages within the local sites of the multilevel data network
are stored locally and indexed such that each of the keywords
points to one or more of the data pages in which the
keywords are contained. The keywords and their index
associations are locally updated. A central database is utilized
to compile and index the locally indexed keywords
from each of the local sites, such that each of the keywords
in the central database points to one or more local sites from
which those keywords came in response to a keyword search
initiated at the client station.

BRIEF DESCRIPTION OF THE DRAWINGS

The novel features believed characteristic of the invention
are set forth in the appended claims. The invention itself,
however, as well as a preferred mode of use, further objects,
and advantages thereof, will best be understood by reference
to the following detailed description of an illustrative
embodiment when read in conjunction with the accompanying
drawings, wherein:

FIG. 1 illustrates a client/server architecture for implementing
the method and system of the present invention;

FIG. 2 depicts a distributed search engine architecture in
accordance with the method and system of the present
invention;

FIG. 3 illustrates a computer network with which the
method and system of the present invention may be practiced;

FIG. 4 depicts a data processing system with which the
method and system of the present invention may be implemented;

FIG. 5 is a high-level block diagram depicting a distribution
of search engine functionality among various network elements
in accordance with the method and system of the
present invention;

FIG. 6 is a diagram depicting a search engine GUI utilized
in accordance with the method and system of the present
invention;

FIG. 7 is a high-level flow diagram illustrating steps
performed with a multilevel network database while maintaining
a distributed search engine in accordance with the
method and system of the present invention; and

FIG. 8 is a high-level logic diagram depicting steps
performed by network data processing devices while performing
a keyword search in accordance with the method
and system of the present invention.

DETAILED DESCRIPTION OF A PREFERRED
EMBODIMENT

The present invention harnesses the distributed, hierarchical
nature of existing Internet infrastructure as embodied
by Web servers, Internet Service Providers (ISPs), and Web
sites, to provide an improved method and system for performing
a network search. Such a method and system greatly
improve both the precision and comprehensiveness of network
searches.

Conventional search engine applications maintain a centralized
keyword index which consumes considerable space
and requires frequent and time consuming updates. The
problem of traffic overload on conventional search engines
caused by such centralized functionality can be eliminated
by first migrating and distributing a portion of the searching
and indexing functionality to local sites and servers. In one
embodiment of the present invention, local sites support
local search engines which perform indexing of all pages
maintained at each respective site. A global, top-level search
engine maintains and periodically updates its own master
index. During such updates, the global search engine incorporates
information from the locally maintained indices at
each Web site.

In an alternate embodiment, the global search engine
would retrieve only the Internet Protocol (IP) address of the
local sites associated with word-to-page links relating to the
searched words. In this manner, when a user commences a
search, the global search engine responds by providing a list
of sites (site addresses) rather than page addresses. The user
may then have the option of visiting sites recovered and
displayed from the search and commence a localized search
utilizing a local search engine which references its own
internal local index as a more accurate and efficient guide for
finding the page(s) meeting the search criteria.

Global search engine (GSE) receives search requests from
users in a conventional manner, by matching keywords
specified by the user to index entries pointing to addresses
within its own internal index. Each 'Search' button or
hypertext search link in the application now points to a
particular HTML search index file residing on the server.

In FIG. 1, FIG. 2, and FIG. 3, like parts are indicated by
like reference numerals. FIG. 1 illustrates a client/server
architecture 100 for implementing the method and system of
the present invention. In FIG. 1, user search requests 101 are
delivered by a client application program 102 to a server
108. Server 108 can be a remote computer system accessible
over the Internet or other communications medium. Server
108 performs scanning and searching of raw (e.g.,
unprocessed) information sources (e.g., newswire feeds or
newsgroups) and, based upon these user requests, presents
the filtered electronic information as server responses 103 to
the client process. The client process may be active in a first
computer system, and the server process may be active in a
second computer system and communicate with the first
computer system over a communications medium, thus
providing distributed functionality and allowing multiple
clients to take advantage of the information-gathering
capabilities of the server.

With reference now to FIG. 2, there is depicted a distributed
search engine architecture in accordance with the
method and system of the present invention. The client and
server are processes which are generated from a high-level
programming language (e.g., PERL) that is operative within
two computer systems. The client and server processes are
interpreted and executed by the computer systems at run-time
(e.g., a workstation), and it can be appreciated by one
skilled in the art that they may be implemented in a variety
of hardware devices, either programmed or dedicated.

Client 102 and server 108 communicate using the functionality
provided by Hypertext-Transfer Protocol (HTTP).
The term Web, as utilized herein, includes all servers adhering
to the HTTP standard, which are accessible to clients via
a Universal Resource Locator. Active within client 102 is a
first process, browser 212, which establishes the connections
with server 108, and presents information to the user. Any
number of commercially or publicly available browsers may
be used, in various implementations.

Server 108 executes the corresponding server software
which presents information to the client in the form of HTTP
responses 210. The HTTP responses 210 correspond with
the Web pages represented using Hypertext Markup Language
(HTML) or other data which is generated by the
server. For example, under the Mosaic-brand browser, in
addition to HTML functionality 204 provided by server 108
(i.e., display and retrieval of certain textual and other data
based upon hypertext views and selection of item(s)), a
Common Gateway Interface (CGI) 206 is provided which
allows the client program to direct server 108 to commence
execution of a specified program contained within server
108. This may include a search engine which scans received
information in the server for presentation to the user controlling
the client. Using this interface, and HTTP responses
210, the server may notify the client of the results of that
execution upon completion.

FIG. 3 illustrates a computer network with which the
method and system of the present invention may be practiced.
Computer network 300 is representative of the
Internet, a known computer network based on the client-server
model discussed earlier. Conceptually, the Internet
includes a large network of servers 108 which are accessible
by clients 102, typically users of personal computers,
through some private Internet-access provider 304 (e.g.,
Internet America) or an on-line service provider 306
(e.g., America On-Line, Prodigy, Compuserve, and
the like). Each of the clients 102 may run a "browser," which
is a known software tool used to access servers 108 via the
access providers. Each server 108 operates a so-called Web
site which supports files in the form of documents and pages.
A network path to servers 108 is identified by a Universal
Resource Locator having a known syntax for defining a
network collection.

Clients 102 are depicted as personal computers, each
including a system unit 322, a video display terminal 324, an
alphanumeric input device (i.e., keyboard 326) having
alphanumeric and other keys, and a mouse 328. An additional
input device (not shown), such as a trackball or stylus,
also can be included with clients 102. Clients 102 can be
implemented utilizing any suitable computer, such as an
IBM Aptiva computer, a product of International Business
Machines Corporation, located in Armonk, N.Y. "Aptiva" is
a registered trademark of International Business Machines
Corporation.

Although the clients 102 in FIG. 3 are depicted as
personal computers, a preferred embodiment of the present
invention may be implemented in other types of data-processing
systems, such as, for example, intelligent workstations
or mini-computers. Clients 102 also preferably
include a graphical user interface that resides within a
machine-readable medium to direct the operation of clients
102.

Turning now to FIG. 4, there is illustrated a typical data
processing system 400 in which a preferred embodiment of
the present invention may be implemented as one of clients
102. A central processing unit (CPU) 402, such as one of the
PC microprocessors available from International Business
Machines Corporation (IBM), is provided and interconnected
to various other components by system bus 401. An
operating system 428 runs on CPU 402 and provides coordination
and control among the various components of data
processing system 400. Operating system 428 may be one of
the commercially available operating systems such as
OS/2™ operating system available from IBM. A program
application 430 operates in conjunction with operating system
428, and provides output calls to operating system 428
which implement the various functions to be performed by
application 430.

A read only memory (ROM) 404 is connected to CPU 402
via bus 401 and includes the basic input/output system (BIOS) that
controls basic computer functions. A random access memory
(RAM) 406, I/O adapter 408 and communications adapter
422 are also interconnected to system bus 401. It should be
noted that software components, such as operating system
428 and application 430, are loaded into RAM 406, which
operates as the main memory for data processing system
400. I/O adapter 408 may be a small computer system
interface (SCSI) adapter that communicates with a disk
storage device 410. Communications adapter 422 interconnects
bus 401 with an external network, enabling data
processing system 400 to communicate with other such
systems over a local area network (LAN) or wide area
network (WAN), such as the Internet. An exemplary WAN
would comprise one or more of servers 108, ISP 304, or
on-line service provider 306. I/O devices are also connected
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to system bus 401 via a user interface adapter 412 and a requests; and, (3) a finer user search option granularity 

display adapter 424 utilizing various components such as a consisting of at least the following two-step search process: 

digital-to-analog converter (not depicted) and the like. By a ) locating sites and/or servers from a global index; b) 

utilizing the aforementioned I/O devices, a user is capable of searching one or more local sites utilizing localized search 

inputting information to data processing system 400 through 5 engines in response to step a), 

input devices such as a keyboard 414 or a mouse 416, and Turning now t0 BG 5> there is d icted a 51ock di 

receiving output information from the system from a speaker of a server-centric search engine deployment scheme of the 

418 or a visual display screen 426. DSE of the present invention. A user 502 initiates operation 

As further illustrated in FIG. 4, a main memory 470 is of a DS £ 500 by first entering one or more search keywords 

connected to system bus 401, and includes a control program 10 and then activating a search execution button 504. The user 

471. Control program 471 resides within main memory 470 . initiation can be accomplished through a variety of user 

and contains instructions that when executed on CPU 402 interface devices such as keyboard 414 or mouse 416 of data 

carry out the operations depicted in the logic flowchart of processing system 400. In one embodiment of the present 

FIGS. 7 and 8 described herein. The computer program invention, search execution button 504 is displayed within a 

product also can be referred to as a program product. Control 15 graphical user interface (GUI) such as GUI window 600 of 

program 471 can support a number of Internet-access tools piG. 6. Upon activation of search execution button 504, the 

including, for example, an HTTP-compliant Web "browser." user » s search request is converted into a hypertext data 

Known browser software applications include: Netscape format and tne newly converted hypertext search request is 

Navigator® ("Netscape") , Mosaic, and the like. Netscape, transmitted to a global search engine (GSE) 506. 

in particular, provides the functionality specified under 20 As illustrated in FIG. 5, global search engine 506 includes 

HTTP, "Netscape" is a trademark of Netscape, Inc. Mosaic- a mas ter index 514 which contains a central keyword 

brand browser is available from the National Center for database (not depicted). This central keyword database is 

Supercomputing Applications (NCSA) in Urbana- periodically updated via search application program 516 

Champaign, 111. The present invention is designed to operate from data retrieved by servers such as local server 508. The 

with any of these known or developing Web browsers, in 25 periodic updates from search application program 516 to the 

order to achieve the display of information associated with keyword database within master index 514 may occur in 

search engine applications launched from the Internet. response to, or independent from a keyword search request 

It is important to note that, while the present invention has by user 502. In one such scenario, user 502 attempts to 
been (and will continue to be) described in the context of a obtain information relating to a particular topic by specify- 
fully functional computer system, those skilled in the art can 30 ing one or more search keywords within search executable 
appreciate that the present invention is capable of being 504. If the keywords entered by user 502 are currently 
distributed as a program product in a variety of forms and unavailable within the centralized keywords database of 
that the present invention applies equally regardless of the master index 514, user 502 may then launch an advanced 
particular type of signal-bearing media utilized to actually search request by activating an "advanced search" option 
carry out the distribution. Examples of signal-bearing media 35 within search executable 504. This advanced search request 
include: recordable-type media, such as floppy disks, hard- will be automatically converted as usual into a hypertext 
disk drives and CD ROMs, and transmission-type media, data format and forwarded from search application program 
such as digital and analog communication links. 516 to one or more local search engines (LSEs) served by 

Communications adapter 422 may be provided by a local server 508. In the depicted example, local server 508 

network card (not depicted) which can be connected to 40 supports LSE 520 and LSE 522 which are associated with 

system bus 401 in order to link data processing system 400 local network sites 510 and 512 respectively. It should be 

to other data-processing system networks in a client/server noted that the depiction of a single local server serving two 

architecture or to groups of computers and associated local sites is provided in FIG. 5 for the sake of simplicity and 

devices which are connected by communications facilities. clarity of explanation. Many additional local servers serving 

Those skilled in the art will appreciate that the hardware 45 one or more sites may also be registered with GSE 506 

depicted in FIG. 4 may vary for specific applications. For consistent with the spirit and scope of the present invention, 

example, other peripheral devices, such as: optical-disk If the keywords entered by user 502 are currently stored 

media, audio adapters, or chip-programming devices, such within the centralized keyword database of master index 

as PAL or EPROM programming devices and the like also may be
utilized in addition to or in place of the hardware already
depicted. Note that any or all of the above components and
associated hardware may be utilized in various embodiments.
However, it can be appreciated that any configuration of the
aforementioned system may be used for various purposes according
to a particular implementation.

Distributed Search Engine Architecture

Based on the multi-layer nature of the World Wide Web, a
distributed search engine (DSE) infrastructure is proposed which
leverages the hierarchical nature of data organization on the
Web. FIGS. 5 through 7 illustrate possible implementations of
such a DSE in which local search engines assume local indexing
responsibilities. Central to the proposed DSE are the following
three key innovations: (1) implementing local search engines at
local servers or sites which maintain and update local indices;
(2) a top-level global search engine which utilizes such local
indexing to point to servers or sites in response to keyword
search

514, GSE 506, supported from a network server, retrieves and
delivers resultant data from master index 514 into a "search
result" GUI within the client station on which search executable
504 resides. In a preferred embodiment of the present invention,
such resultant data includes the identity and network addresses
of network sites containing one or more of the searched keywords.
Therefore, in response to receiving a keyword search request from
search executable 504, GSE 506 "points to" sites which are
associated with the selected keywords within master index 514 and
provides these results to user 502 via a search result GUI which
is described in greater detail with reference to FIG. 6.

As depicted in FIG. 5, local sites 510 and 512 are World Wide Web
(WWW) sites each comprising a collection of related HTML
documents commonly referred to as "Web pages". Web pages 532 and
534 are contained within sites 510 and 512 respectively, while
Web page 533 is shared by both. The depicted Web pages within
sites 510 and 512 are documents each consisting of an HTML file
which has associated files for scripts and graphics in a
particular directory or machine (not depicted). Such Web pages
often include hypertext links to other Web pages.
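The lookup that GSE 506 performs against master index 514 can be illustrated with a minimal Python sketch. The specification does not give an implementation, so everything named here is an assumption made for illustration: the `SiteRef` record, the `MASTER_INDEX` dictionary, the `gse_lookup` function, and the example site names and addresses are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteRef:
    """Identity and network address of a local site (hypothetical record)."""
    name: str
    address: str


# Hypothetical stand-in for master index 514: each keyword points to
# local sites, never to individual Web pages.
MASTER_INDEX = {
    "patent": {
        SiteRef("Site A", "http://site-a.example"),
        SiteRef("Site B", "http://site-b.example"),
    },
    "trademark": {SiteRef("Site B", "http://site-b.example")},
}


def gse_lookup(keywords, master_index=MASTER_INDEX):
    """Return the sites associated with one or more searched keywords."""
    hits = set()
    for keyword in keywords:
        # Lowercasing is an assumed normalization, not stated in the text.
        hits |= master_index.get(keyword.lower(), set())
    # Sort by name so the result list has a stable display order.
    return sorted(hits, key=lambda site: site.name)
```

A call such as `gse_lookup(["patent"])` yields site records rather than page addresses, which mirrors the statement that the resultant data includes the identity and network addresses of sites containing the searched keywords.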

In accordance with an important feature of the present invention,
each of local sites 510 and 512 has an associated local database
which maintains a list of keywords compiled from within each
site. LSEs 520 and 522 include such keyword databases within a
pair of local indices 524 and 526. LSEs 520 and 522 also include
local search application programs 528 and 530 which serve to
update the list of keywords maintained within the keyword
databases of local indices 524 and 526. In a preferred embodiment
of the present invention, local indices include processing means
for indexing the current keyword lists such that each of the
keywords is associated with one or more of the multiple Web pages
within each respective site. In this manner, the distributed
search engine of the present invention delegates much of the
indexing functionality conventionally provided by the top level
search engine to local search engines residing on local servers.
This improvement is increasingly needed as the recursive nature
of mapping keywords to multiple URLs has caused "top-level
indexing" to become nearly unmanageable.

The updated data within local indices 524 and 526 are converted
into a suitable hypertext format and delivered automatically, or
in response to a user request, to GSE 506 via local server 508.
In the latter case user 502 may access the most recent keyword
search information by utilizing an "advanced search" feature
within a graphical user interface (GUI) such as that illustrated
in FIG. 6. Such a user-driven update may be performed even before
a periodic update of the keyword database of master index 514 has
occurred.

The clients and servers depicted in FIGS. 1, 2, and 3 typically
display browsers and other Internet data for a user via a
graphical user interface (GUI) such as GUI 600 illustrated in
FIG. 6. GUI 600 utilizes a well-known type of display format that
enables a user to choose commands, start programs, and see lists
of files and other options by pointing to pictorial
representations (icons) and lists of menu items on the screen.
Choices can be activated generally either with a keyboard or a
mouse. Internet services may be accessed within GUI 600 by
specifying a unique network address (i.e., URL). The URL address
has two basic components, the protocol to be used and the object
pathname. For example, the URL address, "http://www.uspto.gov"
(i.e., home page for the U.S. Patent and Trademark Office),
specifies a hypertext-transfer protocol ("http") and a pathname
of the server ("www.uspto.gov"). The server name is associated
with a unique numeric value (TCP/IP address). A "Web browser" is
a well known type of GUI which may be utilized to support the
utilities of GUI 600 in accordance with the teachings of the
present invention.

As illustrated in FIG. 6, GUI 600 comprises a keyword entry field
604, a search application button 606, and a search result window
608. Search application button 606 is linked to a centralized
search database such as master index 514, such that upon
selection of search application button 606, the GSE, such as GSE
506, retrieves and delivers data from the centralized database
into search result window 608. Selection of search executable 504
thus initiates a top-level search in which GSE 506 is provided
with search instructions in accordance with the keywords typed
into keyword entry field 604. Search result window 608 displays
the search results in a search result field 612. As depicted in
FIG. 6, and in a preferred embodiment of the present invention,
the results displayed within search result field 612 include a
list of local site hypertext links each having an associated
local site search executable button 614. An additional keyword
entry field 616 may also be provided within search result window
608. Each of the local search site executable buttons is linked
to at least one of the local sites served by the local search
engines of the present invention. In this manner the present
invention provides the user with the option of either visiting a
selected site directly, or conducting a further keyword search of
one or more of the local sites displayed within search result
window 608. Results from such localized searches are displayed
within a secondary search result window 610 which is linked to
local indices maintained and updated by the local search engines
(local indices 524 and 526 of FIG. 5, for example), such that
upon selection of a local site search executable 614, the
corresponding local search engine retrieves and delivers search
results from the local index to secondary search result window
610. It should be noted that these secondary results will include
links to specified Web pages within the local sites.

If, as mentioned with reference to FIG. 5 above, the requested
keywords are not currently within master index 514, or the search
result is otherwise deemed insufficient by user 502, user 502 may
select a flag (not depicted) within GUI 600 which will then be
automatically forwarded to search application 516 and
automatically relayed to local search engines 520 and 522. Local
server 508, from which the local search engines operate, takes
note of the flag, and facilitates the hypertext data transfers
necessary to commence local searching by LSEs 520 and 522. In
this manner on-demand local searching may be initiated from GUI
600, resulting in a manual update of master index 514 and search
results provided to a search result window within GUI 600.

FIG. 7 is a high-level flow diagram 700 illustrating steps
performed with a multilevel network database while maintaining a
distributed search engine in accordance with the method and
system of the present invention. As depicted at step 702 of
diagram 700, the distributed search engine originates with local
Web sites registering with a global search engine (GSE) such as
GSE 506 of FIG. 5. After being registered, and as shown at step
704, each of these "member sites" provides or otherwise makes
available to the GSE a list of its own keywords, which are
indexed within a master index such as master index 514. In a
preferred embodiment of the present invention, such indexing
entails associating each keyword with the network address of its
local site or server. In this manner, a single client search
request to a GSE results in user access to a centralized and
comprehensive list of keyword references.

The lists of keywords may be periodically updated automatically
or, as illustrated by steps 706 and 704, such updates may occur
in response to local searches of one or more local sites
conducted by a local search engine (LSE). Proceeding to step 708,
the local searching feature depicted at step 706 also permits
updates for indices maintained in association with local sites
and/or local servers. These local indices associate keywords
contained within each site to the Web page addresses at which the
keywords or related information can be found. Such local index
maintenance results in dramatic time and resource bandwidth
savings on the part of the GSE while continuing to provide a
comprehensive search engine index.
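The maintenance flow of FIG. 7 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the class names `LocalSearchEngine` and `GlobalSearchEngine`, their methods, the regular-expression keyword extraction, and the example addresses are all hypothetical and are not taken from the specification.

```python
import re
from collections import defaultdict


class LocalSearchEngine:
    """Hypothetical LSE: keeps a keyword -> page-address index for one site."""

    def __init__(self, site_address):
        self.site_address = site_address
        self.local_index = defaultdict(set)

    def index_page(self, page_address, text):
        # Steps 706/708: a local search refreshes the local index.
        # Splitting on non-alphanumerics is an assumed keyword extractor.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            self.local_index[word].add(page_address)

    def keywords(self):
        return set(self.local_index)


class GlobalSearchEngine:
    """Hypothetical GSE: keeps a keyword -> site-address master index."""

    def __init__(self):
        self.members = {}
        self.master_index = defaultdict(set)

    def register(self, lse):
        # Step 702: a local site registers and becomes a member site.
        self.members[lse.site_address] = lse

    def update_from(self, lse):
        # Step 704: record which site holds each keyword, not which page.
        if lse.site_address not in self.members:
            raise KeyError("site is not registered")
        for keyword in lse.keywords():
            self.master_index[keyword].add(lse.site_address)


lse = LocalSearchEngine("http://site-a.example")
lse.index_page("http://site-a.example/faq.html", "Patent search FAQ")
gse = GlobalSearchEngine()
gse.register(lse)
gse.update_from(lse)
```

The page-level detail stays in `local_index`, while `master_index` stores only site addresses, which is the division of labor the text credits for the savings on the part of the GSE. The sketch only adds entries; removing keywords that have disappeared from a site is left out for brevity.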

FIG. 8 is a high-level logic diagram 800 depicting steps
performed by network data processing devices while performing a
keyword search in accordance with the method and system of the
present invention. Following start block 802, a keyword search
commences as depicted at step 804. A keyword search request is
initiated by a user from a client station such as data processing
system 400. A typical search request such as that depicted at
step 804 is initiated utilizing a keyword selection field
(keyword entry field 604, for example), in conjunction with a
search engine executable such as search application button 606. A
GSE receives the search request initiated as shown at step 804
and retrieves master index results in response thereto as
illustrated at step 806. As explained with reference to master
index 514 of FIG. 5, the master index maintains a comprehensive
list of keywords and associates each keyword with one or more
local sites.

The results retrieved from the master index are presented within
a search result GUI as depicted in FIG. 6. In these results, the
master index utilizes the searched keywords to point to one or
more local sites and/or servers. As shown at step 808, a user may
locate relevant Web pages by selecting a local search option
associated with each of the local site "hits" retrieved at step
806. The selection depicted at step 808 may be performed by
selecting a local site search executable such as local site
search executable buttons 614 of FIG. 6. Finally, steps 810, 808,
and 812 illustrate the process by which a user may continue
searching the local sites identified at step 806.

Preferred implementations of the invention include
implementations as a computer system programmed to execute the
method or methods described herein, and as a program product.
According to the computer system implementation, sets of
instructions for executing the method or methods are resident in
RAM of one or more computer systems configured generally as
described above. Until required by the computer system, the set
of instructions may be stored as a computer-program product in
another computer memory, for example, in a disk drive (which may
include a removable memory such as an optical disk or floppy disk
for eventual utilization in disk drive).

The computer-program product can also be stored at another
computer and transmitted when desired to the user's workstation
by a network or by an external communications network. One
skilled in the art can appreciate that the physical storage of
the sets of instructions physically changes the medium upon which
it is stored so that the medium carries computer-readable
information. The change may be electrical, magnetic, chemical, or
some other physical change. While it is convenient to describe
the invention in terms of instructions, symbols, characters, or
the like, the reader should remember that all of these and
similar terms should be associated with the appropriate physical
elements. Thus, a method for implementing the steps described
with reference to FIGS. 5, 6, 7, and 8 can be accomplished with a
computer-aided device. In such a method, data stored in a memory
unit of a data-processing system can represent steps in a method
for implementing a preferred embodiment of the present invention.

While the invention has been particularly shown and described
with reference to a preferred embodiment, it will be understood
by those skilled in the art that various changes in form and
detail may be made therein without departing from the spirit and
scope of the invention. For example, the present invention is
applicable to other communication networks besides the Internet,
including "intranets" (i.e., networks internal to particular
organizations). It is therefore contemplated that such
modifications can be made without departing from the spirit or
scope of the present invention as defined in the appended claims.

What is claimed is:

1. A method for facilitating a keyword search request
initiated at a client station within a multilevel data network,
wherein said multilevel data network includes a plurality of
local sites each containing a plurality of data pages, said
method comprising the steps of:

within each of said plurality of local sites, indexing
keywords from said plurality of data pages within a
local database, such that within said local database,
each of said keywords points to one or more of said
plurality of data pages;

compiling and indexing said keywords from each local
database within a central database, such that within said
central database, each of said keywords points only to
at least one of said local sites in response to a keyword
search request initiated at said client station.

2. The method of claim 1, wherein said multilevel data
network further comprises a collection of interlinked hypertext
documents, and wherein said compiling and indexing
steps are performed utilizing a hypertext transfer protocol.

3. The method of claim 1, wherein each of said plurality
of local sites is served by a local server, and wherein said
compiling and indexing step comprises the step of associating
each of said keywords with one or more of said local
servers.

4. The method of claim 1, wherein said compiling and
indexing keywords from said plurality of data pages into a
local database, further comprises the step of individually
searching each of said local databases for occurrences of
said keywords.

5. The method of claim 1, wherein said multilevel data
network further comprises a global search engine accessible
from a search graphic user interface (GUI) having a search
executable and a search result window within said client
station, said method further comprising the steps of:

receiving said keyword search request at said global
search engine; and

retrieving and delivering data from said central database
into said search result window in response to said
receiving step.

6. The method of claim 5, wherein said global search
engine is served by a network server, and wherein said
retrieving step comprises the step of pointing to at least one
of said local sites utilizing said network server.

7. The method of claim 5, wherein said search result
window further comprises a local site search executable
linked to said local search engines, said method further
comprising the step of initiating a search by at least one of
said local search engines of at least one of said local sites.

8. The method of claim 5, wherein said keyword search
request includes the step of selecting a keyword search
request from an application search button within said search
GUI, said search executable including a HTTP pathname.

9. The method of claim 5, wherein said keyword search
request comprises the step of converting said keyword
search request into a data format readable by said multilevel
data network.

10. The method of claim 1, further comprising locally
updating said plurality of keywords within each of said local
indices.

11. The method of claim 10, wherein said multilevel data
network further comprises local search engines associated
with each of said plurality of local sites, and wherein said
step of updating said keywords contained within each of said
local indices is performed utilizing said local search engines.

12. The method of claim 11, further comprising the step
of installing said local search engines as HTML search files
on at least one local server.

13. A method for facilitating a keyword search request
initiated at a client station within a multilevel data network,
wherein said multilevel data network includes a plurality of
local sites each containing a plurality of data pages, said 
method comprising the steps of: 

within each of said plurality of local sites, indexing 
keywords from said plurality of data pages within a 
local database, such that within said local database, 
each of said keywords points to one or more of said 
plurality of data pages; 

compiling and indexing said keywords from each local 
database within a central database, such that within said 
central database, each of said keywords points only to 
at least one of said local sites; 

responsive to receiving an initial keyword search request 
from a graphical user interface on said client station, 
searching said central database for local sites indexed 
in accordance with the contents of said initial keyword 
search request; 

returning a list of one or more of said local sites indexed 
in accordance with the contents of said initial keyword 
search request to a search result window within said 
graphical user interface, wherein said search result 
window includes a keyword entry field and an inde- 
pendent search request selection option field associated 
with each entry of said returned one or more local sites; 
and 

responsive to receiving a subsequent keyword search 
request issued in accordance with the contents of said 
search result window keyword entry field and selection 
of one or more of said independent search request 
selection fields, searching said local databases for data 
pages indexed in accordance with the contents of said 
subsequent keyword search request. 

14. A system for facilitating a keyword search request 
initiated at a client station within a multilevel data network, 
wherein said multilevel data network includes a plurality of 
local sites each containing a plurality of data pages, said 
system comprising: 

a plurality of local databases each uniquely associated 
with each of said local sites for indexing keywords 
from said plurality of data pages, such that within each 
of said local databases said keywords point to one or 
more of said plurality of data pages; 

a central database for compiling and indexing said key- 
words from each of said local databases, such that 
within said central database, each of said keywords 
points only to at least one of said local sites; and 

a global search engine for accessing said central database 
to point to at least one of said plurality of local sites in 
response to a keyword search request initiated at said 
client station, such that said global search engine may 
provide a comprehensive search response to said key- 
word search request. 

15. The system of claim 14, wherein each of said local 
databases further comprises a local index for associating
each of said keywords compiled from within each local site 
with one or more of said plurality of data pages contained 
within each respective local site. 

16. The system of claim 14, wherein said central database 
further comprises a master index for associating each of said
keywords with one or more sites among said plurality of 
local sites. 

17. The system of claim 14, wherein said multilevel data 
network comprises a collection of interlinked hypertext 
documents. 

18. The system of claim 17, wherein said plurality of data 
pages are Web pages. 



19. The system of claim 14, wherein said global search
engine and said local search engines utilize a hypertext data 
format. 

20. The system of claim 14, wherein each of said local 
sites is served by a local server at which an associated local 
search engine resides as an off-loaded search engine appli- 
cation. 

21. The system of claim 20, wherein said central database 
further comprises a master index for associating each of said 
keywords with one or more of said servers which serve said 
local sites. 

22. The system of claim 20, wherein said local servers 
serving each of said local sites support and maintain said 
local databases. 

23. The system of claim 14, wherein said global search 
engine comprises a search graphical user interface (GUI) 
which resides at said client station and which includes a 
keyword entry field and a search executable. 

24. The system of claim 23, wherein said global search 
engine further comprises a search result window within said 
search GUI and linked to said central database, such that 
upon selection of the search executable, said global search 
engine retrieves and delivers data from said central database 
into said search result window. 

25. The system of claim 24, wherein said search result 
window further comprises a local site search executable 
linked to at least one of said local search engines for 
initiating a search by said at least one local search engine of 
one or more of said local sites. 

26. The system of claim 25, wherein said search result 
window further comprises a secondary search result window 
that is linked to said local database, wherein upon selection 
of said local site search executable, said local search engine 
retrieves and delivers data from said local database into said 
secondary search result window. 

27. The system of claim 14, further comprising a local 
search engine associated with each of said local sites for 
updating said list of keywords contained within each of said 
local databases. 

28. A system for facilitating a keyword search request 
initiated at a client station within a multilevel data network, 
wherein said multilevel data network includes a plurality of 
local sites each containing a plurality of data pages, said 
system comprising: 

a plurality of local databases each uniquely associated 
with each of said local sites for indexing keywords 
from said plurality of data pages, such that each of said 
keywords points to one or more of said plurality of data 
pages; 

a central database for compiling and indexing said key- 
words from each of said local databases, such that 
within said central database, each of said keywords 
points only to at least one of said local sites; and 
a global search engine for: 
responsive to receiving an initial keyword search 
request from a graphical user interface on said client 
station, searching said central database for local sites 
indexed in accordance with the contents of said 
initial keyword search request; and 
returning a list of one or more of said local sites 
indexed in accordance with the contents of said 
keyword search request to a search result window 
within said graphical user interface, wherein said 
search result window includes a keyword entry field 
and an independent search request selection field 
associated with each entry of said returned one or 
more local sites; and
local search engines responsive to receiving a subsequent
keyword search request issued in accordance with the 
contents of said search result window keyword entry 
field and selection of one or more of said independent 
search request selection fields, for searching said local 
databases for data pages indexed in accordance with the 
contents of said subsequent keyword search request. 

29. A computer program product stored in signal bearing 
media for facilitating a keyword search request initiated at a 
client station within a multilevel data network, wherein said 
multilevel data network includes a plurality of local sites 
each containing a plurality of data pages, said program 
product comprising: 

instruction means stored in signal bearing media for, 
within each of said plurality of local sites, indexing
keywords from said plurality of data pages into local 
databases, such that within said local databases, each of 
said keywords points to one or more of said plurality of 
data pages; 

instruction means stored in signal bearing media for 
compiling and indexing said keywords from each of 
said local databases into a central database, such that 
within said central database, each of said keywords 
points only to at least one of said local sites in response
to a keyword search request initiated at said client 
station. 

30. The program product of claim 29, wherein said 
multilevel data network further comprises a collection of 
interlinked hypertext documents, and wherein said instruc- 
tion means for compiling and indexing utilize a hypertext 
transfer protocol. 

31. The program product of claim 30, wherein each of 
said plurality of local sites is served by a local server, and 
wherein said instruction means for compiling and indexing 
comprises instruction means for associating each of said 
keywords with one or more of said local servers. 

32. The program product of claim 29, wherein said 
instruction means for compiling and indexing keywords 
from said plurality of data pages into a local database, 
further comprises instruction means for individually search- 
ing each of said local databases for occurrences of said 
keywords. 

33. The program product of claim 29, wherein said 
multilevel data network further comprises a global search 
engine accessible from a search graphic user interface (GUI) 
having a search executable and a search result window 
within said client station, said program product further 
comprising: 

instruction means for receiving said keyword search 

request at said global search engine; and 
instruction means for retrieving and delivering data from 

said central database into said search result window. 

34. The program product of claim 33, wherein said global 
search engine is served by a network server, and wherein
said instruction means for retrieving comprises instruction 
means for pointing to at least one of said local sites utilizing 
said network server. 

35. The program product of claim 33, wherein said search 
result window further comprises a local site search execut- 
able linked to said local search engines, said program 
product further comprising instruction means for initiating a 



search by at least one of said local search engines of at least 
one of said local sites. 

36. The program product of claim 33, further comprising 
instruction means for converting said keyword search 
request into a data format readable by said multilevel data 
network. 

37. The program product of claim 29, wherein said 
multilevel data network further comprises local search 
engines associated with each of said plurality of local sites, 
and wherein said instruction means for updating said key- 
words contained within each of said local databases are 
executed by said local search engines. 

38. The program product of claim 27, further comprising 
instruction means for installing said local search engines as 
HTML search files on at least one of said local servers. 

39. The program product of claim 29, further comprising 
instruction means stored in signal bearing media for locally 
updating said plurality of keywords within each of said local 
indices. 

40. A computer program product stored in signal bearing 
media for facilitating a keyword search request initiated at a 
client station within a multilevel data network, wherein said 
multilevel data network includes a plurality of local sites 

each containing a plurality of data pages, said program
product comprising: 

instruction means stored in signal bearing media for, 
within each of said plurality of local sites, indexing 
keywords from said plurality of data pages into local 
databases, such that within said local databases, each of 
said keywords points to one or more of said plurality of 
data pages; 

instruction means stored in signal bearing media for 
compiling and indexing said keywords from each of 
said local databases into a central database, such that 
within said central database, each of said keywords 
points only to at least one of said local sites; 
instruction means stored in signal bearing media respon- 
sive to receiving an initial keyword search request from 
a graphical user interface on said client station, for 
searching said central database for local sites indexed 
in accordance with the contents of said initial keyword 
search request; 
instruction means stored in signal bearing media for 
returning a list of one or more of said local sites 
indexed in accordance with the contents of said key- 
word search request to a search result window within 
said graphical user interface, wherein said search result 
window includes a keyword entry field and an inde- 
pendent search request selection field associated with 
each entry of said returned one or more local sites; and 
instruction means stored in signal bearing media respon- 
sive to receiving a subsequent keyword search request 
issued in accordance with the contents of said search 
result window keyword entry field and selection of one 
or more of said independent search request selection 
fields, for searching said local databases for data pages 
indexed in accordance with the contents of said subsequent
keyword search request.